difference feature
Multi-Scale Temporal Difference Transformer for Video-Text Retrieval
Wang, Ni, Liao, Dongliang, Xu, Xing
Currently, in the field of video-text retrieval, there are many transformer-based methods. Most of them usually stack frame features and regrade frames as tokens, then use transformers for video temporal modeling. However, they commonly neglect the inferior ability of the transformer modeling local temporal information. To tackle this problem, we propose a transformer variant named Multi-Scale Temporal Difference Transformer (MSTDT). MSTDT mainly addresses the defects of the traditional transformer which has limited ability to capture local temporal information. Besides, in order to better model the detailed dynamic information, we make use of the difference feature between frames, which practically reflects the dynamic movement of a video. We extract the inter-frame difference feature and integrate the difference and frame feature by the multi-scale temporal transformer. In general, our proposed MSTDT consists of a short-term multi-scale temporal difference transformer and a long-term temporal transformer. The former focuses on modeling local temporal information, the latter aims at modeling global temporal information. At last, we propose a new loss to narrow the distance of similar samples. Extensive experiments show that backbone, such as CLIP, with MSTDT has attained a new state-of-the-art result.
ShapeFormer: Shapelet Transformer for Multivariate Time Series Classification
Le, Xuan-May, Luo, Ling, Aickelin, Uwe, Tran, Minh-Tuan
Multivariate time series classification (MTSC) has attracted significant research attention due to its diverse real-world applications. Recently, exploiting transformers for MTSC has achieved state-of-the-art performance. However, existing methods focus on generic features, providing a comprehensive understanding of data, but they ignore class-specific features crucial for learning the representative characteristics of each class. This leads to poor performance in the case of imbalanced datasets or datasets with similar overall patterns but differing in minor class-specific details. In this paper, we propose a novel Shapelet Transformer (ShapeFormer), which comprises class-specific and generic transformer modules to capture both of these features. In the class-specific module, we introduce the discovery method to extract the discriminative subsequences of each class (i.e. shapelets) from the training set. We then propose a Shapelet Filter to learn the difference features between these shapelets and the input time series. We found that the difference feature for each shapelet contains important class-specific features, as it shows a significant distinction between its class and others. In the generic module, convolution filters are used to extract generic features that contain information to distinguish among all classes. For each module, we employ the transformer encoder to capture the correlation between their features. As a result, the combination of two transformer modules allows our model to exploit the power of both types of features, thereby enhancing the classification performance. Our experiments on 30 UEA MTSC datasets demonstrate that ShapeFormer has achieved the highest accuracy ranking compared to state-of-the-art methods. The code is available at https://github.com/xuanmay2701/shapeformer.
Difference-based Deep Convolutional Neural Network for Simulation-to-reality UAV Fault Diagnosis
Zhang, Wei, Tong, Junjie, Liao, Fang, Zhang, Yunfeng
Identifying the fault in propellers is important to keep quadrotors operating safely and efficiently. The simulation-to-reality (sim-to-real) UAV fault diagnosis methods provide a cost-effective and safe approach to detect the propeller faults. However, due to the gap between simulation and reality, classifiers trained with simulated data usually underperform in real flights. In this work, a new deep neural network (DNN) model is presented to address the above issue. It uses the difference features extracted by deep convolutional neural networks (DDCNN) to reduce the sim-to-real gap. Moreover, a new domain adaptation method is presented to further bring the distribution of the real-flight data closer to that of the simulation data. The experimental results show that the proposed approach can achieve an accuracy of 97.9\% in detecting propeller faults in real flight. Feature visualization was performed to help better understand our DDCNN model.
Detecting and Correcting Adversarial Images Using Image Processing Operations
Nguyen, Huy H., Kuribayashi, Minoru, Yamagishi, Junichi, Echizen, Isao
ABSTRACT Deep neural networks (DNNs) have achieved excellent performance on several tasks and have been widely applied in both academia and industry. However, DNNs are vulnerable to adversarial machine learning attacks, in which noise is added to the input to change the network output. We have devised an image-processing-based method to detect adversarial images based on our observation that adversarial noise is reduced after applying these operations while the normal images almost remain unaffected. In addition to detection, this method can be used to restore the adversarial images' original labels, which is crucial to restoring the normal functionalities of DNN-based systems. Testing using an adversarial machine learning database we created for generating several types of attack using images from the ImageNet Large Scale Visual Recognition Challenge database demonstrated the efficiency of our proposed method for both detection and correction.